United States Patent [19]
Maki

US005729707A

[11] Patent Number: 5,729,707
[45] Date of Patent: Mar. 17, 1998

[54] INSTRUCTION PREFETCH CIRCUIT AND CACHE DEVICE WITH BRANCH DETECTION

[75] Inventor: Kazuhiko Maki, Tokyo, Japan

[73] Assignee: Oki Electric Industry Co., Ltd., Tokyo, Japan

[21] Appl. No.: 539,683

[22] Filed: Oct. 5, 1995

[30] Foreign Application Priority Data

Oct. 6, 1994 [JP] Japan 6-242553

[51] Int. Cl.6 G06F 9/38

[52] U.S. Cl. 395/383; 395/581; 395/582

[58] Field of Search 395/383, 581, 582

[56] References Cited

U.S. PATENT DOCUMENTS

4,847,753 7/1989 Matsuo et al. 395/585
4,881,170 11/1989 Morisada 395/383
4,984,154 1/1991 Hanatani et al. 395/587
5,283,873 2/1994 Steely, Jr. et al. 395/383
5,327,536 7/1994 Suzuki 395/585
5,507,028 4/1996 Liu 395/383
5,542,109 7/1996 Blomgren et al. 395/581
5,586,278 12/1996 Papworth et al. 395/582

FOREIGN PATENT DOCUMENTS

0 320 098 A3 6/1989 European Pat. Off.
23 27 315.6 2/1974 Germany
58-129660 8/1983 Japan
63-170740 7/1988 Japan
2-144626 6/1990 Japan
3-84630 4/1991 Japan
4-348430 12/1992 Japan
5-88891 4/1993 Japan

OTHER PUBLICATIONS

Scott McFarling and John Hennessy, Reducing the Cost of
Branches, 1986, IEEE, pp. 396-403.

Johnny K. F. Lee and Alan Jay Smith, Branch Prediction
Strategies and Branch Target Buffer Design, Jan. 1984, pp.
6-22.

IBM Technical Disclosure Bulletin, Efficient Scheme to
Reduce Over-Prefetching of Instructions for Loading an
Instruction Buffer, 1990, pp. 423-425.

Primary Examiner: Krisna Lim

Attorney, Agent, or Firm: Rabin, Champagne & Lynt, P.C.



[57] ABSTRACT

In an instruction prefetch circuit, even when a branch
instruction is prefetched, the circuit continues a prefetch
operation until branching is actually executed. Accordingly,
when the branch instruction is a conditional branch
instruction and not actually executed, the circuit continues the
prefetch operation so that the prefetched instructions are
efficiently supplied to a processor. It may be arranged that
when the branch instruction is an unconditional branch
instruction, a branch destination address is extracted from
the unconditional branch instruction and used as a prefetch
address. Accordingly, the circuit continues the prefetch
operation even when branching is executed. It may further
be arranged that, when the branch instruction is a conditional
branch instruction, a branch destination address is extracted
from the conditional branch instruction and further a branch
prediction is performed. When branching is expected based
on the branch prediction, the branch destination address is
used as a prefetch address. Accordingly, as long as the
branch prediction does not fail, the circuit continues the
prefetch operation.

6 Claims, 10 Drawing Sheets 

INSTRUCTION PREFETCH CIRCUIT AND CACHE DEVICE WITH BRANCH
DETECTION

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an instruction prefetch
circuit and a cache device for accelerating the process in an
information processing system and the like.

2. Description of the Prior Art

FIG. 11 is a block diagram showing a conventional
instruction prefetch circuit.

The shown instruction prefetch circuit is connected to a
CPU (central processing unit) of a microcomputer or the like
for fetching in advance, that is, prefetching instruction data
D required by the CPU at the next cycle so as to increase the
instruction executing speed. The shown instruction prefetch
circuit may also apply to a cache device. In this case, the
cache device prefetches instruction data D so as to accelerate
operations thereof.

In FIG. 11, the instruction prefetch circuit includes an
address generating section 10 which receives a signal FA
indicative of a fetch address sent from the CPU (not shown)
and generates a prefetch address, a memory 20 which is
connected to an output side of the address generating section
10 and stores instructions corresponding to addresses, and a
determination signal generating section 30 which is also
connected to the output side of the address generating
section 10 and outputs to the CPU a signal VALID indicative
of whether the instruction data D output to the CPU is valid
or invalid. To an output side of the memory 20 is connected
a data register 40, as a data sending section, which holds the
data D read out from the memory 20 and outputs it to the
CPU. A predecoder 51 is connected to an output side of the
data register 40. An output side of the predecoder 51 is
connected to one of input terminals of a two-input OR gate
52 through flip-flop 62. A reset signal RST is selectively
inputted to the other input terminal of the OR gate 52 under
the control of the CPU. The instruction prefetch circuit
further includes a synchronous RS-FF (reset-set flip-flop) 54
which receives the signal VALID via an inverter 53. An
output side of the OR gate 52 is connected to a reset terminal
of the RS-FF 54 so that the RS-FF 54 is reset by a reset
signal R fed from the OR gate 52. Further, an output SEL0
of the RS-FF 54 is inputted to the address generating section
10.

The address generating section 10 includes a selector 11
having an input terminal which receives the fetch signal FA
from the CPU. The address generating section 10 further
includes a prefetch address register 12 connected to an
output side of the selector 11, and an incrementer 13 which
receives an output of the register 12. The incrementer 13
adds 1 (one) to an address outputted from the register 12 and
outputs the address incremented by one to another input
terminal of the selector 11. The selector 11 further receives
the output signal SEL0 from the RS-FF 54 and selects one
of the fetch signal FA from the CPU and the output signal
from the incrementer 13 depending on the output signal
SEL0 for feeding to the register 12. Accordingly, the output
signal SEL0 from the RS-FF 54 works as a load selection
signal for the register 12. The register 12 is loaded with the
selected address outputted from the selector 11 and outputs
the loaded address to the memory 20 as a prefetch address.

The determination signal generating section 30 includes a
tag register 31 and a matcher 32. The tag register 31 holds
the output from the register 12 and outputs it to the matcher
32. The matcher 32 compares the outputted address from the
tag register 31 with the fetch address FA from the CPU.
If the address outputted from the tag register 31 matches or
agrees with the fetch address FA, the matcher 32 and AND
gate 60 produce the signal VALID at level "1". On the other
hand, if negative, the matcher 32 and AND gate 60 produce
the signal VALID at level "0". The signal VALID is sent to
the CPU through AND gate 60 which produces the logical
combination of the matcher 32 output and the predecoder 51
output through flip-flop 62.

Now, an operation of the instruction prefetch circuit
shown in FIG. 11 will be explained hereinbelow with
reference to a timing chart shown in FIG. 12.

In FIG. 12, the fetch signal FA, the prefetch address
loaded in the register 12, the instruction data D sent from the
register 40 and the signals VALID, SEL0 and RST are
shown. When the reset signal RST is "1", the signal SEL0
of the RS-FF 54 working as the load selection signal for the
register 12 is reset to "0". At this stage, the register 40 does
not output effective data. When the reset signal RST
becomes "0", the fetch signal FA is loaded in the register 12.
While the selection signal SEL0 is "1", the circuit of FIG. 11
operates in a prefetch mode so that the register 12 continues
to be loaded with incremented addresses. The memory
20 outputs to the register 40 the data D corresponding to the
address inputted from the register 12.

If the instruction data D corresponding to (2) in FIG. 12
is a branch instruction, the predecoder 51 detects it from the
output of the register 40 and the prefetch mode is released.
Thereafter, an address A representative of a branch
destination address appears in the fetch signal FA from the CPU.
The branch destination address A is loaded in the register 12
and the circuit starts to operate in the prefetch mode again
as seen from FIG. 12.

However, the foregoing conventional instruction prefetch
circuit has the following problem:

The prefetch operation of the circuit is stopped even when
the prefetched instruction detected by the predecoder 51 is
a conditional branch instruction, to say nothing of an
unconditional branch instruction. As a result, even when the
conditional branch instruction is not established or executed,
the prefetch operation is once stopped so that a waiting time
is unnecessarily caused to lower the processing performance
of the CPU.

SUMMARY OF THE INVENTION

Therefore, it is an object of the present invention to
provide an improved instruction prefetch circuit or cache
device.

According to one aspect of the present invention, an
instruction prefetch circuit comprises a memory prestoring
instructions to be used in a processor, the memory prestoring
the instructions corresponding to addresses; address
generating means for incrementing a first instruction address
inputted from the processor so as to generate a prefetch
address for sending to the memory; data sending means for
reading out the instruction corresponding to the prefetch
address from the memory and for sending the read-out
instruction to the processor; determination signal generating
means for detecting whether or not the prefetch address
agrees with a second instruction address inputted from the
processor after the first instruction address, the
determination signal generating means supplying to the processor a
determination signal indicative of the instruction sent from
the data sending means being valid when the prefetch
address agrees with the second instruction address, while
supplying to the processor a determination signal indicative
of the instruction sent from the data sending means being
invalid when the prefetch address disagrees with the second
instruction address; and selection signal generating means
for sending a selection signal to the address generating
means, the selection signal causing the address generating
means to select the second instruction address as a new
prefetch address other than an address derived by
incrementing the first instruction address, only upon change of the
determination signal from valid to invalid.
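
By way of illustration only, the behaviour recited in this aspect may be sketched in software as follows. The sketch is a simplified cycle-by-cycle model and not the circuit itself: the names `PrefetchUnit`, `reset` and `step` are hypothetical, the pipeline registers are collapsed into a single address, and a Python dictionary stands in for the memory.

```python
from dataclasses import dataclass


@dataclass
class PrefetchUnit:
    """Simplified behavioural model (illustrative names, not from the patent)."""
    memory: dict              # address -> instruction
    prefetch_addr: int = 0    # contents of the prefetch address register
    prev_valid: bool = False  # determination signal of the preceding cycle

    def reset(self, fetch_addr: int) -> None:
        # After reset, the first instruction address from the processor is loaded.
        self.prefetch_addr = fetch_addr
        self.prev_valid = False

    def step(self, fetch_addr: int):
        """One cycle: compare addresses, deliver data, pick the next address."""
        valid = self.prefetch_addr == fetch_addr
        data = self.memory.get(self.prefetch_addr)
        if self.prev_valid and not valid:
            # Determination signal changed from valid to invalid:
            # select the processor's address as the new prefetch address.
            self.prefetch_addr = fetch_addr
        else:
            # Otherwise keep prefetching the incremented address.
            self.prefetch_addr += 1
        self.prev_valid = valid
        return valid, data
```

In this model a taken branch shows up only as one invalid cycle, after which prefetching resumes from the branch destination; a conditional branch that is not taken causes no interruption at all.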

According to another aspect of the present invention, an
instruction prefetch circuit comprises a memory prestoring
instructions to be used in a processor, the memory prestoring
the instructions corresponding to addresses; address
generating means for incrementing a first instruction address
inputted from the processor so as to generate a prefetch
address for sending to the memory; data sending means for
reading out the instruction corresponding to the prefetch
address from the memory and for sending the read-out
instruction to the processor; determination signal generating
means for detecting whether or not the prefetch address
agrees with a second instruction address inputted from the
processor after the first instruction address, the
determination signal generating means supplying to the processor a
determination signal indicative of the instruction sent from
the data sending means being valid when the prefetch
address agrees with the second instruction address, while
supplying to the processor a determination signal indicative
of the instruction sent from the data sending means being
invalid when the prefetch address disagrees with the second
instruction address; means for detecting that the instruction
sent from the data sending means is an unconditional branch
instruction; and means for extracting a branch destination
address from the unconditional branch instruction; the
address generating means sending the branch destination
address to the memory as a new prefetch address when the
instruction sent from the data sending means is the
unconditional branch instruction.

According to another aspect of the present invention, an
instruction prefetch circuit comprises a memory prestoring
instructions to be used in a processor, the memory prestoring
the instructions corresponding to addresses; address
generating means for incrementing a first instruction address
inputted from the processor so as to generate a prefetch
address for sending to the memory; data sending means for
reading out the instruction corresponding to the prefetch
address from the memory and for sending the read-out
instruction to the processor; determination signal generating
means for detecting whether or not the prefetch address
agrees with a second instruction address inputted from the
processor after the first instruction address, the
determination signal generating means supplying to the processor a
determination signal indicative of the instruction sent from
the data sending means being valid when the prefetch
address agrees with the second instruction address, while
supplying to the processor a determination signal indicative
of the instruction sent from the data sending means being
invalid when the prefetch address disagrees with the second
instruction address; means for detecting that the instruction
sent from the data sending means is a conditional branch
instruction instructing a conditional branch; means for
extracting a branch destination address from the conditional
branch instruction; prediction means for predicting whether
or not the conditional branch is actually executed; selection
means for selecting, based on the branch prediction, one of
an address derived by incrementing the first instruction
address and the branch destination address; and means for
outputting, based on a signal indicative of whether the
branch prediction fails or not, one of the address selected by
the selection means and a third instruction address being
inputted from the processor, to the address generating means
as a prefetch address.

It may be arranged that the prediction means extracts a 
branch prediction bit in an instruction field of the conditional 
branch instruction and uses it as a prediction value. 

It may be arranged that the prediction means uses a 
random value as a prediction value. 

It may be arranged that the prediction means includes a 
table storing a past branch history of the conditional branch 
instruction and derives a prediction value based on the past 
branch history. 
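
Purely as an illustration, the three arrangements of the prediction means mentioned above may be sketched as follows. The function and class names are hypothetical, the position of the prediction bit is an assumed parameter, and the history table is assumed here to hold only the most recent outcome per branch address, with "not taken" as the default for a branch not yet seen; none of these details are fixed by the foregoing description.

```python
import random


def predict_from_bit(instruction: int, bit_pos: int = 0) -> bool:
    """Use a prediction bit in the instruction word (bit position is assumed)."""
    return bool((instruction >> bit_pos) & 1)


def predict_random(rng: random.Random) -> bool:
    """Use a random value as the prediction value."""
    return rng.random() < 0.5


class HistoryPredictor:
    """Table of past branch outcomes; one entry per branch address (assumed)."""

    def __init__(self) -> None:
        self.table = {}

    def predict(self, branch_addr: int) -> bool:
        # Assumption: predict "not taken" for a branch with no history.
        return self.table.get(branch_addr, False)

    def update(self, branch_addr: int, taken: bool) -> None:
        # Record the actual outcome once the branch has been resolved.
        self.table[branch_addr] = taken
```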

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will be understood more fully from 
the detailed description given hereinbelow, taken in con- 
junction with the accompanying drawings. 
In the drawings: 

FIG. 1 is a block diagram showing an instruction prefetch 
circuit according to a first preferred embodiment of the 
present invention; 

FIG. 2 is a time chart for explaining an operation of the 
instruction prefetch circuit shown in FIG. 1; 

FIG. 3 is a block diagram showing an instruction prefetch 
circuit according to a second preferred embodiment of the 
present invention; 

FIG. 4 is a block diagram showing a structure of a 
predecoder shown in FIG. 3; 

FIG. 5 is a time chart for explaining an operation of the 
instruction prefetch circuit shown in FIG. 3; 

FIG. 6 is a block diagram showing an instruction prefetch 
circuit according to a third preferred embodiment of the
present invention; 

FIG. 7 is a block diagram showing a circuit for generating 
a signal MISS which is used in the instruction prefetch 
circuit shown in FIG. 6;

FIG. 8 is a block diagram showing an instruction prefetch
circuit according to a fourth preferred embodiment of the 
present invention; 

FIG. 9 is a block diagram showing branch prediction 
means used in the instruction prefetch circuit shown in FIG. 
8; 

FIG. 10 is a diagram showing branch prediction means 
used in an instruction prefetch circuit according to a fifth 
preferred embodiment of the present invention; 

FIG. 11 is a block diagram showing a conventional 
instruction prefetch circuit; and 

FIG. 12 is a time chart for explaining an operation of the 
conventional instruction prefetch circuit shown in FIG. 11. 

DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

Now, preferred embodiments of the present invention will 
be described hereinbelow with reference to the accompany- 
ing drawings. 
First Embodiment 

FIG. 1 is a block diagram showing an instruction prefetch 
circuit according to a first preferred embodiment of the 
present invention. In FIG. 1, the same or like elements are
represented by the same symbols as those in FIG. 11. 
Like the circuit shown in FIG. 11, the instruction prefetch
circuit of FIG. 1 includes an address generating section 10
which receives a signal FA indicative of a fetch instruction
address sent from a CPU (not shown) and generates a
prefetch address, a memory 20 which is connected to an
output side of the address generating section 10 and stores
instructions, to be used in the CPU, corresponding to
addresses, a determination signal generating section 30
which is also connected to the output side of the address
generating section 10 and outputs to the CPU a signal
VALID indicative of whether the instruction data D outputted
to the CPU is valid or invalid, and a data register 40
which is connected to an output side of the memory 20 and
holds the data D read out from the memory 20 for feeding
to the CPU.

In the instruction prefetch circuit of this embodiment,
selection signal generating means 60 is provided instead of
the predecoder 51 and the associated elements in FIG. 11.
The selection signal generating means 60 produces a
selection signal SEL1 which changes its level upon change of the
signal VALID from "valid" to "invalid" so as to switch an
address to be loaded at the address generating section 10 to
the fetch address FA sent from the CPU, which will be
described later in detail.

The address generating section 10 includes a selector 11
having an input terminal which receives the fetch signal FA
from the CPU. The address generating section 10 further
includes a prefetch address register 12 connected to an
output side of the selector 11, and an incrementer 13 which
receives an output of the register 12. The incrementer 13
adds 1 (one) to an address outputted from the register 12 and
outputs the address incremented by one to another input
terminal of the selector 11. The selector 11 further receives
the output signal SEL1 and selects one of the fetch signal FA
from the CPU and the output signal from the incrementer 13
depending on the output signal SEL1 for feeding to the
register 12. The register 12 is loaded with the selected
address outputted from the selector 11 and outputs the
loaded address to the memory 20 as a prefetch address.

The determination signal generating section 30 includes a
tag register 31 and a matcher 32. The tag register 31 holds
the output from the register 12 and outputs it to the matcher
32. The matcher 32 compares the outputted address from the
tag register 31 with the fetch address FA from the CPU. If
the address outputted from the tag register 31 matches or
agrees with the fetch address FA, the matcher 32 produces
the signal VALID at level "1". On the other hand, if
negative, the matcher 32 produces the signal VALID at
level "0". The signal VALID is sent to the CPU.

The selection signal generating means 60 includes an
RS-FF 62 which is inputted with the signal VALID via an
inverter 61, and a D-FF 63 (data-delay flip-flop) which is
directly inputted with the signal VALID. The RS-FF 62 has
a reset terminal R which is selectively inputted with a reset
signal RST for setting a reset or set mode. On the other hand,
the D-FF 63 is provided for latching a state of the signal
VALID and has an output terminal connected to one of input
terminals of a two-input AND gate 64. The other input
terminal of the AND gate 64 is inputted with an inverted
value of the signal VALID. An output signal of the AND gate
64 is inverted and inputted to one of input terminals of a
two-input AND gate 65. Further, an output terminal of the
RS-FF 62 is connected to the other input terminal of the
AND gate 65. The selection signal SEL1 is supplied to the
selector 11 from the AND gate 65.

Now, an operation of the instruction prefetch circuit
shown in FIG. 1 will be explained hereinbelow with
reference to a time chart shown in FIG. 2.

In FIG. 2, the fetch signal FA, the prefetch address loaded
in the register 12, the instruction data D sent from the
register 40 and the signals VALID, SEL1 and RST are
shown. When the reset signal RST is "1", the selection
signal SEL1 of the AND gate 65 is reset to "0". At this stage,
the register 40 does not output effective data. When the reset
signal RST becomes "0", the fetch signal FA is loaded in the
register 12 and the selection signal SEL1 is set to "1". While
the selection signal SEL1 is "1", the circuit of FIG. 1
operates in a prefetch mode so that the register 12 continues
to be loaded with the incremented addresses. The memory
20 outputs to the register 40 the data D corresponding to the
address inputted from the register 12.

If the instruction data D corresponding to (2) in FIG. 2 is
an unconditional branch instruction, or a conditional branch
instruction and the branching is executed, the fetch address
FA from the CPU and the address outputted from the register
12 differ from each other so that the matcher 32 changes a
level of the signal VALID from "1" to "0" which represents
"invalid". Specifically, when the branching is executed, the
fetch address FA becomes A while the address outputted
from the register 12 becomes 4 in FIG. 2 so that both
addresses differ from each other to cause the matcher 32 to
change the level of the signal VALID from "1" to "0". This
change in level of the signal VALID is detected by the
operations of the D-FF 63 and the AND gate 64 so as to
forcibly change the selection signal SEL1 from "1" to "0" at
one cycle as shown in FIG. 2. Specifically, it is so arranged
that, when the signal VALID is changed from "1" to "0", a
level at the input terminal of the AND gate 64 which
receives the inverted value of the signal VALID changes
from "0" to "1", while a level at the input terminal of the
AND gate 64 connected to the D-FF 63 remains "1" at one
cycle due to the latching operation of the D-FF 63. Thus, a
level at the input terminal of the AND gate 65 connected to
the AND gate 64 becomes "0" while a level at the input
terminal of the AND gate 65 connected to the RS-FF 62
remains "1" so that the selection signal SEL1 becomes "0"
at one cycle as shown in FIG. 2.

Accordingly, the selector 11 selects the fetch address FA
for feeding to the register 12 so that the register 12 is loaded
with the address A and the prefetch operation is performed
again as shown in FIG. 2. As appreciated, when the
instruction data D corresponding to (2) in FIG. 2 is the conditional
branch instruction and the branching is not executed, the
signal VALID continues to represent "valid" as opposed to
the foregoing description so that the prefetch operation
continues to be performed normally.

In the first preferred embodiment, as described above,
even when the branch instruction is prefetched, the prefetch
operation continues until discrepancy between the fetch
address FA and the address outputted from the register 12
actually occurs to cause the matcher 32 to change the signal
VALID from "valid" to "invalid". Accordingly, when the
instruction is the conditional branch instruction and the
branching is not executed, the instruction can be fed to the
CPU efficiently as compared with the conventional
instruction prefetch circuit shown in FIG. 11. Further, since the
instruction does not need to be predecoded as in the circuit
of FIG. 11, the hardware can be simplified and the process
delay can be avoided. Further, when the circuit shown in
FIG. 1 is applied to an advance-writable cache device, since
update can be continued to increase a hit ratio of the device,
the performance thereof is improved.

Second Embodiment

FIG. 3 is a block diagram showing an instruction prefetch
circuit according to a second preferred embodiment of the
present invention. In FIG. 3, the same or like elements are
represented by the same symbols as those in FIG. 1. 

As shown in FIG. 3, the instruction prefetch circuit of this 
embodiment further includes a predecoder 71, a selector 72 
and an AND gate 73. The other structure is essentially the 
same as that shown in FIG. 1. 

The predecoder 71 is connected to an output side of the 
data register 40. The predecoder 71 is provided for detecting
that the prefetched instruction D from the data register 40 is 
an unconditional branch instruction and for extracting a 
branch destination address from the unconditional branch 
instruction. The selector 72 is provided between an output
side of the incrementer 13 and an input side of the selector 
11. The selector 72 receives the incremented address from 
the incrementer 13 and further receives the branch destina- 
tion address from the predecoder 71 when the prefetched 
instruction D is the unconditional branch instruction. The 
selector 72 selects one of them for feeding to the selector 11. 

A detection signal, indicative of whether the instruction
data D is the unconditional branch instruction, of the
predecoder 71 is inputted to one of input terminals of a
two-input AND gate 73. To the other input terminal of the 
AND gate 73 is inputted the signal VALID. The AND gate 
73 outputs a selection signal SEL2 to the selector 72 which 
switches selection between the addresses from the incre- 
menter 13 and the predecoder 71 based on the selection 
signal SEL2. On the other hand, the selection signal SEL1 
is inputted to the selector 11. In this preferred embodiment,
the selection signal SEL1 is produced using the selection 
signal generating means 60 in the first preferred 
embodiment, but may be produced using the predecoder 51 
in the conventional circuit shown in FIG. 11. 

FIG. 4 is a block diagram showing a structure of the 
predecoder 71. 

The predecoder 71 includes a coincidence circuit 71-1. 
The coincidence circuit 71-1 extracts, for example, 6 bits 
representing a feature of the instruction outputted from the 
data register 40 for detecting coincidence with a preselected 
bit feature of the unconditional branch instruction. The 
coincidence circuit 71-1 outputs a coincidence signal S71-1 
as a result of the detection to the AND gate 73. The 
predecoder 71 further includes an address extracting register 
71-2 and a bit extender 71-3. The register 71-2 is inputted
with, for example, 26 bits showing an address of an instruction
field of the instruction outputted from the data register
40 so as to fetch a branch destination address designated by 
the instruction when the signal S71-1 indicates "coinci- 
dence". A stream of 26 bits outputted from the register 71-2 
is extended to a stream of 32 bits by the bit extender so as 
to be fed to the selector 72. 
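
As a rough software analogue of the predecoder 71, the coincidence detection and address extraction may be sketched as follows. Several details are assumptions made only for the example: a 32-bit instruction word, the 6 feature bits taken from the top of the word, the hypothetical opcode value `0b000010`, and zero-extension from 26 to 32 bits (the description above says only that the bit stream is extended).

```python
# Hypothetical 6-bit pattern identifying an unconditional branch.
UNCOND_BRANCH_OPCODE = 0b000010


def predecode(instruction: int):
    """Return (coincidence, branch_target) for a 32-bit instruction word."""
    # Coincidence circuit 71-1: compare the 6 feature bits (assumed: bits 31..26).
    opcode = (instruction >> 26) & 0x3F
    coincidence = opcode == UNCOND_BRANCH_OPCODE

    target = None
    if coincidence:
        # Register 71-2: take the 26-bit address field (assumed: bits 25..0).
        field26 = instruction & 0x03FF_FFFF
        # Bit extender 71-3: extend to 32 bits (zero-extension assumed).
        target = field26 & 0xFFFF_FFFF
    return coincidence, target
```

The boolean corresponds to the coincidence signal S71-1 supplied to the AND gate 73, and the extracted value to the branch destination address fed to the selector 72.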

Now, an operation of the instruction prefetch circuit 
shown in FIG. 3 will be explained hereinbelow with reference
to a time chart shown in FIG. 5.

In FIG. 5, when, for example, the instruction data D 
corresponding to (2) is the unconditional branch instruction, 
the predecoder 71 detects it and the AND gate 73 outputs the 
corresponding selection signal SEL2 to the selector 72. 
Simultaneously, the predecoder 71 extracts the branch des- 
tination address designated by that unconditional branch 
instruction and feeds it to the selector 72. The selector 72 
selects the branch destination address based on the selection 
signal SEL2 for feeding to the selector 11. Accordingly, the 
selector 11 outputs the branch destination address so that the 
register 12 is loaded with the branch destination address for 
feeding to the memory 20, the incrementer 13 and the tag 
register 31. 
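
The address selection just described may be summarised, for illustration, by the following combinational sketch. The function name and argument names are hypothetical; SEL1 is taken as "1" in the prefetch mode and "0" when the fetch address FA is to be loaded, as in the first preferred embodiment.

```python
def next_prefetch_address(current, fetch_addr, valid,
                          is_uncond_branch, branch_target, sel1):
    """Address loaded into the register 12 at the next cycle (sketch)."""
    # AND gate 73: SEL2 is asserted for a valid unconditional branch.
    sel2 = is_uncond_branch and valid
    # Selector 72: branch destination address or incremented address.
    from_selector_72 = branch_target if sel2 else current + 1
    # Selector 11: prefetch path while SEL1 is "1", otherwise the address FA.
    return from_selector_72 if sel1 else fetch_addr
```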

In this preferred embodiment, the instruction prefetch
circuit continues to perform the prefetch operation even
when the unconditional branch instruction is prefetched.
Specifically, the prefetch operation is performed using the
branch destination address designated by the unconditional
branch instruction. As appreciated, when the instruction data
D corresponding to (2) in FIG. 5 is the conditional branch
instruction, the instruction prefetch circuit of this embodiment
works essentially in the same manner as that of the
foregoing first preferred embodiment. Further, when the
circuit shown in FIG. 3 is applied to an advance-writable
cache device, since update can be continued to increase a hit
ratio of the device even when an unconditional branch
instruction is prefetched, the performance thereof is
improved.
Third Embodiment 

FIG. 6 is a block diagram showing an instruction prefetch
circuit according to a third preferred embodiment of the
present invention. In FIG. 6, the same or like elements are
represented by the same symbols as those in FIG. 3.

As shown in FIG. 6, the instruction prefetch circuit of this
embodiment includes the address generating section 10, the
memory 20, the determination signal generating section 30,
and the data register 40 like the circuit shown in FIG. 3. The
circuit of this embodiment includes a predecoder 81 and a
selector 82. The predecoder 81 is connected to an output side
of the data register 40. The predecoder 81 is provided for
detecting that the prefetched instruction D from the data
register 40 is a conditional branch instruction and for
extracting a branch destination address from the conditional
branch instruction. The selector 82 is provided between an
output side of the incrementer 13 and an input side of the
selector 11. The selector 82 receives the incremented address
from the incrementer 13 and further receives the branch
destination address from the predecoder 81 when the
prefetched instruction D is the conditional branch
instruction. The selector 82 selects one of them for feeding to the
selector 11.

A detection signal, indicative of whether the instruction 
data D is the conditional branch instruction, of the prede- 
coder 81 is inputted to one of input terminals of a two- input 
40 AND gate 83. To the other input terminal of the AND gate 
83 is inputted a given branch prediction bit of the prefetched 
instruction D outputted from the data register 40. An output 
of the AND gate 83 and the signal VALID are inputted to- 
input terminals of a two-input AND gate 84. The AND gate 
45 84 outputs a selection signal SEL2 to the selector 82 which 
switches selection between the addresses from the incre- 
menter 13 and the predecoder 81 based on the selection 
signal SEL2. On the other hand, a selection signal SEL1 is 
inputted to the selector 11. In this preferred embodiment, 
since prediction is performed whether the conditional branch 
instruction is actually executed or not, a signal MISS is 
further provided for loading the fetch address FA when the 
prediction fails. The signal MISS and the reset signal RST 
are inputted to an OR gate 85 an output of which is, in turn, 
inputted to a reset terminal R of a synchronous RS-FF 86. 
The RS-FF 86 outputs the selection signal SEL1 to the 
selector 11 which switches selection between the address 
from the selector 82 and the fetch address FA based on the 
selection signal SEL1. As an input signal to a set terminal S
of the RS-FF 86, the mode setting signal SEL1 at the
preceding cycle is used. 

FIG. 7 is a block diagram showing a circuit for generating 
the signal MISS. 

This circuit includes an FA change extracting section 91 
which receives the fetch address FA and extracts its change,
a counter 92 for counting an output of the FA change 
extracting section 91, that is, the number of the fetch 
addresses FA sent from the CPU, a coincidence circuit 93 for
detecting that an output of the counter 92 is "1", an OR gate 
94 for deriving the logical sum of an output of the coinci- 
dence circuit 93 and the reset signal RST, and a synchronous 
RS-FF 95 having a reset terminal R which is inputted with 
an output of the OR gate 94. These elements cooperatively 
work for setting a timing of the signal MISS. The RS-FF 95 
has a set terminal S which is inputted with the detection 
result from the predecoder 81, that is, the detection signal 
indicative of whether the instruction data D is the condi- 
tional branch instruction. An output side of the RS-FF 95 is 
connected to an enable terminal E of the counter 92 and to 
one of input terminals of a two-input AND gate 96. An
inverted value of the signal VALID is inputted to the other 
input terminal of the AND gate 96. The signal MISS is 
outputted from the AND gate 96. The circuit of FIG. 7 
checks whether the branch prediction has failed or not, and 
produces the signal MISS when it turns out that the branch 
prediction has failed. 

Now, an operation of the instruction prefetch circuit 
shown in FIG. 6 will be described hereinbelow. 

When the conditional branch instruction is detected while 
operating in the prefetch mode, it is estimated based on the 
branch prediction bit in the instruction field of the instruc-
tion D whether the branching will be executed. The branch
prediction bit is prepared according to the known prediction 
algorithm when a compiler or programmer makes a pro- 
gram. According to an instruction set, the branch prediction 
bit may be in the form of a sign bit indicative of a direction 
(plus or minus direction) of the branch destination address. 
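
As an illustration of the sign-bit form of the branch prediction bit, the following sketch reads the sign bit of a displacement field. The 16-bit field width, its position in the low-order bits of the instruction word, and the convention that a negative (backward) displacement means "predict branched" are all assumptions made for this example; the text does not fix an encoding.

```python
DISP_BITS = 16  # hypothetical width of the displacement field


def prediction_bit_from_sign(instruction_word: int) -> int:
    """Return the sign bit of the displacement field (1 = minus direction)."""
    # Assumption: the displacement occupies the low-order DISP_BITS bits,
    # so its sign bit is bit DISP_BITS - 1 of the instruction word.
    return (instruction_word >> (DISP_BITS - 1)) & 1
```

Under these assumptions a backward branch, such as the closing branch of a loop, yields a prediction bit of 1.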

When the conditional branch instruction is detected as 
noted above, the corresponding detection signal is fed from 
the predecoder 81 to the set terminal S of the RS-FF 95 so 
as to set the RS-FF 95 and thus to set the counter 92 for 
counting the number of the fetch addresses FA sent from the 
CPU. Thereafter, since, in this embodiment, the branch 
destination address appears, if the branching is executed, in 
the second fetch address FA from the fetch address FA 
indicative of the corresponding conditional branch 
instruction, the RS-FF 95 is reset when the counter value 
becomes "1". Accordingly, if the signal VALID becomes "0",
meaning "invalid", while the RS-FF 95 remains set and
thus holds the corresponding input terminal of the AND gate
96 at level "1", the signal MISS is outputted from the AND
gate 96 for resetting the RS-FF 86 so that the selector 11
selects the fetch address FA sent from the CPU. 
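
The generation of the signal MISS can be modeled roughly as below. This is a simplified cycle-level sketch, not a timing-accurate model of FIG. 7: the class and method names are hypothetical, reset is assumed to take priority over set in the RS-FF 95, and the counter 92 is assumed to be cleared together with the flip-flop.

```python
class MissDetector:
    """Simplified cycle-level sketch of the MISS generating circuit."""

    def __init__(self):
        self.armed = False    # output of RS-FF 95
        self.count = 0        # value of counter 92
        self.last_fa = None   # previous FA, used by change extracting section 91

    def step(self, fa, is_cond_branch, valid, rst=False):
        """Advance one clock and return the MISS value for this cycle."""
        # AND gate 96: MISS = (RS-FF 95 set) AND (NOT VALID).
        miss = self.armed and not valid
        # FA change extracting section 91.
        fa_changed = self.last_fa is not None and fa != self.last_fa
        self.last_fa = fa
        # Counter 92 counts FA changes only while enabled by RS-FF 95.
        if self.armed and fa_changed:
            self.count += 1
        # Coincidence circuit 93 and OR gate 94 reset RS-FF 95.
        if rst or self.count == 1:
            self.armed = False
            self.count = 0    # assumption: counter cleared with the flip-flop
        elif is_cond_branch:
            self.armed = True  # set by the detection signal of predecoder 81
        return miss
```

In this sketch, MISS is produced only when VALID falls to "0" in the window between detection of the conditional branch instruction and the next change of the fetch address FA.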

As appreciated, as long as the branch prediction is correct, 
the instruction prefetch circuit of FIG. 6 performs the 
operation which is essentially the same as that shown in FIG. 
5. Specifically, the predecoder 81 fetches the branch desti- 
nation address designated by the prefetched conditional 
branch instruction and feeds it to the selector 82. The
selector 82 selects the branch destination address based on 
the selection signal SEL2 for feeding to the selector 11. 
Accordingly, the selector 11 outputs the branch destination 
address so that the register 12 is loaded with the branch 
destination address for feeding to the memory 20, the 
incrementer 13 and the tag register 31. 

In this preferred embodiment, the instruction prefetch 
circuit continues to perform the prefetch operation irrespec- 
tive of whether the prefetched conditional branch instruction 
is executed or not, as long as the branch prediction does not 
fail. Further, when this preferred embodiment is combined
with the foregoing second preferred embodiment, the 
instruction prefetch circuit is capable of continuing the 
prefetch operation under any of the stored instructions as 
long as the branch prediction does not fail. When the circuit 
shown in FIG. 6 is applied to an advance-writable cache
device, since update can be continued to increase a hit ratio 
of the device even when a conditional branch instruction is 
prefetched, the performance thereof is improved. 
Fourth Embodiment

FIG. 8 is a block diagram showing an instruction prefetch 
circuit according to a fourth preferred embodiment of the 
present invention. In FIG. 8, the same or like elements are 
represented by the same symbols as those in FIG. 6. 

In the foregoing third preferred embodiment, the branch 
prediction bit is used as a prediction value for the branch 
prediction. On the other hand, in this preferred embodiment, 
branch prediction means 100 is provided for the branch 
prediction. Specifically, in FIG. 8, an output of the branch 
prediction means 100 is inputted to one of the input termi-
nals of the AND gate 83 instead of the branch prediction bit.
The other structure is essentially the same as that shown in 
FIG. 6. 

FIG. 9 is a block diagram showing the branch prediction 
means 100. The branch prediction means 100 is in the form
of a random value generator constituted by a D-FF. An
output terminal Q of the D-FF is connected to a data terminal
D thereof. The branch prediction is performed using an
output of the random value generator as a prediction value.
Specifically, the output of the random value generator
represents a random value of "1" or "0" depending on a
timing. Since the mean probability of "1" or "0" in this
random value generator is 0.5, a hit ratio becomes higher
than when the prediction value is fixed to "1" or "0" in a case
where determination of the prediction value is difficult.
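
The effect of a random prediction value on the hit ratio can be checked with a short calculation. The sketch below assumes that the prediction value is statistically independent of the actual branch outcome; the function name and parameters are hypothetical and introduced only for this illustration.

```python
def expected_hit_ratio(p_predict_taken: float, p_taken: float) -> float:
    """Probability that an independent prediction matches the branch outcome."""
    # Hit when both say "branched" or both say "non-branched".
    return p_predict_taken * p_taken + (1 - p_predict_taken) * (1 - p_taken)
```

With a prediction probability of 0.5 the expected hit ratio is 0.5 whatever the actual branch rate, whereas a fixed prediction value gives either the branch rate or its complement. Under this assumption, the random value avoids the case in which a wrongly chosen fixed value yields a hit ratio below 0.5.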

As appreciated, the instruction prefetch circuit shown in
FIG. 8 performs the prefetch operation similar to that of the
foregoing third preferred embodiment. In case of an instruc-
tion set with no branch prediction bit in the instruction field,
the branch prediction can be achieved with the simplified
hardware in this preferred embodiment.
Fifth Embodiment 

FIG. 10 is a diagram showing another example of the 
branch prediction means 100 according to a fifth preferred 
embodiment of the present invention.

As shown in FIG. 10, in this preferred embodiment, the
branch prediction means 100 includes a history table
memory 111, a comparator 112 and a two-input AND gate
113. The history table memory 111 stores past histories of
"branched" or "non-branched" in terms of address tags I-N.
The history table memory 111 is inputted with low-order bits
of the fetch address FA for outputting a corresponding
address tag and a past history thereof. The comparator 112
is inputted with high-order bits of the fetch address FA and
the address tag outputted from the history table memory 111
for comparison therebetween. A result of the comparison at
the comparator 112 is inputted to one of input terminals of
the AND gate 113. To the other input terminal of the AND
gate 113 is inputted the corresponding past history outputted
from the history table memory 111. As appreciated, an
output of the AND gate 113 represents a branch prediction
value and is fed to one of the input terminals of the AND
gate 83 in FIG. 8.

In this embodiment, even in case of an instruction set with 
no branch prediction bit in the instruction field, the branch
prediction can be performed based on the past branch
history. In FIG. 10, since it is impractical to provide the
history table memory 111 for all the addresses, the so-called
set-associative mode is adopted. Naturally, it is more effi-
cient to adopt the so-called full-associative mode. When the
fetch address FA is not found in the history table memory
111, it is arranged that "non-branched" is outputted from the 
history table memory 111. 
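
The lookup performed by the history table memory 111, the comparator 112 and the AND gate 113 can be sketched as follows. For brevity the sketch uses a single entry per index (one tag and one comparator, as drawn in the description); the class name, the 4-bit index width and the record method are hypothetical, since the text does not describe how the table is written.

```python
class HistoryTable:
    """Sketch of history table memory 111 with comparator 112 and AND gate 113."""

    def __init__(self, index_bits=4):
        self.index_bits = index_bits
        # Each entry holds (address tag, history bit); None means empty.
        self.entries = [None] * (1 << index_bits)

    def _split(self, fa):
        index = fa & ((1 << self.index_bits) - 1)  # low-order bits of FA
        tag = fa >> self.index_bits                # high-order bits of FA
        return index, tag

    def predict(self, fa):
        """Return True for "branched", False for "non-branched"."""
        index, tag = self._split(fa)
        entry = self.entries[index]
        if entry is None:
            return False
        stored_tag, branched = entry
        match = stored_tag == tag   # comparator 112
        return match and branched   # AND gate 113

    def record(self, fa, branched):
        # Hypothetical update path, not described in the text.
        index, tag = self._split(fa)
        self.entries[index] = (tag, bool(branched))
```

An address that is absent from the table, or whose tag does not match, yields "non-branched", which corresponds to the default arranged above.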
As described above, in this embodiment, the history table
memory 111 is provided for performing the branch predic-
tion based on the past branch history. Accordingly, the
prefetched instruction can be fed to the CPU efficiently even
in case of the instruction set having no branch prediction bit.
As appreciated, in case of a program which requires execu-
tion of a loop a plurality of times, a hit ratio of the branch
prediction is improved. Further, in case of an advance-
writable cache device, since update can be continued even
when a conditional branch instruction is prefetched, a hit
ratio of the device is increased to improve the performance 
thereof. 

In this embodiment, as described above, when the fetch 
address FA is not found in the history table memory 111, it
is arranged that "non-branched" is outputted from the his-
tory table memory 111. However, "branched" may be out-
putted from the history table memory 111 by replacing the
AND gate 113 with an OR gate and inverting the output of 
the comparator 112. 
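
The two default behaviors can be compared directly as Boolean functions. The sketch below is illustrative only; the function names are hypothetical, with match standing for the output of the comparator 112 and history for the past history outputted from the history table memory 111.

```python
def predict_default_non_branched(match: bool, history: bool) -> bool:
    # AND gate 113: a tag mismatch forces "non-branched".
    return match and history


def predict_default_branched(match: bool, history: bool) -> bool:
    # OR gate fed by the inverted comparator output:
    # a tag mismatch forces "branched".
    return (not match) or history
```

When the tags match, both functions simply return the stored history; they differ only when the fetch address FA is not found.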

While the present invention has been described in terms 
of the preferred embodiments, the invention is not to be
limited thereto, but can be embodied in various ways with- 
out departing from the principle of the invention as defined 
in the appended claims. 

What is claimed is: 
1. An instruction prefetch circuit comprising:

a memory prestoring instructions to be used in a
processor, said memory prestoring said instructions
corresponding to addresses;

address generating means for incrementing a first instruc-
tion address inputted from said processor so as to
generate a prefetch address for sending to said memory;

data sending means for reading out the instruction corre-
sponding to said prefetch address from said memory
and for sending said read-out instruction to said proces-
sor;

determination signal generating means for detecting
whether or not said prefetch address agrees with a
second instruction address inputted from said processor
after said first instruction address, said determination
signal generating means supplying to said processor a
determination signal indicative of said instruction sent
from said data sending means being valid when said
prefetch address agrees with said second instruction
address, while supplying to said processor a determi-
nation signal indicative of said instruction sent from
said data sending means being invalid when said
prefetch address disagrees with said second instruction
address; and

selection signal generating means for sending a selection
signal to said address generating means, said selection
signal causing said address generating means to select
said second instruction address as a new prefetch
address other than an address derived by incrementing
said first instruction address, only upon change of said
determination signal from valid to invalid.

2. An instruction prefetch circuit comprising: 

a memory prestoring instructions to be used in a 
processor, said memory prestoring said instructions 
corresponding to addresses;

address generating means for incrementing a first instruc- 
tion address inputted from said processor so as to 
generate a prefetch address for sending to said memory; 

data sending means for reading out the instruction corre-
sponding to said prefetch address from said memory
and for sending said read-out instruction to said pro-
cessor;

determination signal generating means for detecting 
whether or not said prefetch address agrees with a 
second instruction address inputted from said processor 
after said first instruction address, said determination 
signal generating means supplying to said processor a 
determination signal indicative of said instruction sent 
from said data sending means being valid when said 
prefetch address agrees with said second instruction 
address, while supplying to said processor a determi- 
nation signal indicative of said instruction sent from 
said data sending means being invalid when said 
prefetch address disagrees with said second instruction 
address; 

means for detecting that said instruction sent from said 
data sending means is an unconditional branch instruc- 
tion; and 

means for extracting a branch destination address from 
said unconditional branch instruction; 

said address generating means sending said branch des- 
tination address to said memory as a new prefetch 
address when said instruction sent from said data 
sending means is the unconditional branch instruction. 

3. An instruction prefetch circuit comprising: 

a memory prestoring instructions to be used in a 
processor, said memory prestoring said instructions 
corresponding to addresses; 

address generating means for incrementing a first instruc- 
tion address inputted from said processor so as to 
generate a prefetch address for sending to said memory; 

data sending means for reading out the instruction corre- 
sponding to said prefetch address from said memory 
and for sending said read-out instruction to said pro- 
cessor; 

determination signal generating means for detecting
whether or not said prefetch address agrees with a 
second instruction address inputted from said processor 
after said first instruction address, said determination 
signal generating means supplying to said processor a 
determination signal indicative of said instruction sent 
from said data sending means being valid when said 
prefetch address agrees with said second instruction 
address, while supplying to said processor a determi-
nation signal indicative of said instruction sent from
said data sending means being invalid when said 
prefetch address disagrees with said second instruction 
address; 

means for detecting that said instruction sent from said 
data sending means is a conditional branch instruction 
instructing a conditional branch; 

means for extracting a branch destination address from 
said conditional branch instruction; 

prediction means for predicting whether or not said con- 
ditional branch is actually executed; 

selection means for selecting, based on said branch 
prediction, one of an address derived by incrementing 
said first instruction address and said branch destination 
address; and 

means for outputting, based on a signal indicative of 
whether said branch prediction fails or not, one of said
address selected by said selection means and a third 
instruction address being inputted from said processor, 
to said address generating means as a prefetch address. 

4. The instruction prefetch circuit according to claim 3, 
wherein said prediction means extracts a branch prediction 
bit in an instruction field of said conditional branch instruc- 
tion and uses it as a prediction value. 

5. The instruction prefetch circuit according to claim 3,
wherein said prediction means uses a random value as a 
prediction value. 

6. The instruction prefetch circuit according to claim 3, 
wherein said prediction means includes a table storing a past 
branch history of the conditional branch instruction, said
prediction means deriving a prediction value based on said 
past branch history. 

***** 